Using Start/End Timings of Spectral Transitions Between Phonemes in Concatenative Speech Synthesis

Authors

  • Toshio Hirai
  • Seiichi Tenpaku
Abstract

The definition of “phoneme boundary timing” in a speech corpus affects the quality of concatenative speech synthesis systems. For example, if the selected speech unit does not appropriately match the speech unit of the required phoneme environment, the quality may be degraded. In this paper, a dynamic segment boundary definition is proposed. Under this definition, the concatenation point is chosen from the start or end timing of the spectral transition, depending on the phoneme environment at the boundary. For a listening test comparing the naturalness of the conventional and proposed methods, 100 Japanese place names were selected randomly and synthesized. With four subjects, the ratio of naturalness was 1 to 3.3 (conventional vs. proposed).
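The dynamic boundary idea can be illustrated with a minimal sketch. The abstract does not state the actual selection rule, so the rule below is an assumption made purely for illustration: at the right edge of a unit, the cut is placed at the end of the spectral transition when the phoneme that followed the unit in the corpus matches the phoneme required by the target, and at the start of the transition otherwise. The names `Boundary` and `right_cut_time` and the timing values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """Spectral transition at the right edge of a corpus speech unit."""
    next_phoneme: str        # phoneme that followed the unit in the corpus
    transition_start: float  # start timing of the spectral transition (s)
    transition_end: float    # end timing of the spectral transition (s)


def right_cut_time(boundary: Boundary, target_next_phoneme: str) -> float:
    """Choose the concatenation point from the two transition timings.

    ASSUMED rule (not taken from the paper): keep the transition when the
    corpus context matches the target context, otherwise cut before it.
    """
    if boundary.next_phoneme == target_next_phoneme:
        return boundary.transition_end
    return boundary.transition_start


# Hypothetical boundary: the unit was followed by /a/ in the corpus.
b = Boundary(next_phoneme="a", transition_start=0.120, transition_end=0.145)
matched_cut = right_cut_time(b, "a")     # context matches the target
mismatched_cut = right_cut_time(b, "o")  # context differs from the target
```

The point of the sketch is only that the boundary is no longer a single fixed timing per phoneme pair: the same corpus unit can yield different cut points depending on the environment it is concatenated into.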

Similar resources

A new distance measure for costing spectral discontinuities in concatenative speech synthesizers

In many modern concatenative speech synthesisers the unit sequence used to synthesise each sentence is determined at runtime by a search algorithm seeking to optimise a multidimensional cost function. One of these costs is usually some form of spectral continuity cost, computed between the end of one segment and the start of the following segment, intended to ensure that the synthetic speech do...

An Unit Selection based Hindi Text To Speech Synthesis System Using Syllable as a Basic Unit

Concatenative speech synthesis using the phoneme, di-phone and allophone as an elementary unit for Hindi speech synthesis requires significant quality improvement. The naturalness of the state-of-the-art waveform synthesizer is attributed to the use of the syllable as a basic unit. The primary reason for choosing the syllable as a basic unit is that Indian languages are syllable-centered. This ...

Spectral control in concatenative speech synthesis

We report on research in which we increased the degree of spectral control in concatenative synthesis by controlling the formant frequencies of the synthetic speech, as well as the energies in four spectral bands. In addition, we eliminated “points” of concatenation in favor of “regions” of concatenation, by crossfading between the end and the beginning of two speech segments that are part of a...

On the Detection of Discontinuities in Concatenative Speech Synthesis

Over the last decade, considerable work has been done on finding an objective distance measure that can predict audible discontinuities in concatenative speech synthesis. Speech segments in concatenative synthesis are extracted from disjoint phonetic contexts, and discontinuities in spectral shape and phase mismatches tend to occur at unit boundaries. Many feature sets, most of them of spectral na...

Publication date: 2002